Weighted Degenerated Approximate Pattern Matching
Abstract
We present a bit-parallel approach to the degenerated approximate pattern matching problem: finding approximate matches of a "special" pattern in a text of degenerate symbols. The special pattern has the form P = s_1 *(a_1,b_1) ... s_l *(a_l,b_l) s_{l+1} *(a_{l+1},b_{l+1}) ... s_ω, where the symbol *(a,b) denotes a sequence of at least a and at most b "don't care" symbols, each of which matches any symbol of the alphabet. In other words, P is a sequence of subpatterns separated by gaps. Each subpattern s_l carries integer weights for replacements, insertions, and deletions, and the problem is to match the pattern so that the sum of weights is minimized. The total time complexity is (k(log(k+2)+1)mn)/w, where m is the length of the pattern P, n is the length of the text of degenerate symbols, k is the maximum number of edit operations performed, and w is the length of the computer word.
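The two ingredients named in the abstract, bit-parallelism over a degenerate text and minimum-weight approximate matching, can be illustrated separately with a small sketch. This is not the paper's algorithm: it omits the bounded gaps *(a,b), uses one global set of weights instead of per-subpattern weights, and the second function is a plain dynamic program, not a bit-parallel one. The function names, the representation of the text as a list of symbol sets, and the parameters `w_sub`, `w_ins`, and `w_del` are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def shift_and_degenerate(pattern: str, text: List[Set[str]]) -> List[int]:
    """Exact Shift-And matching of a solid pattern against a degenerate text.

    Each text position is a set of possible symbols. Returns the 0-based
    end positions of all exact matches. Python integers stand in for the
    machine word of length w.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    m = len(pattern)
    # masks[c] has bit i set when pattern[i] == c.
    masks: Dict[str, int] = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << i)
    accept = 1 << (m - 1)
    state = 0
    ends: List[int] = []
    for j, symbols in enumerate(text):
        # A degenerate position matches pattern[i] if any of its symbols does.
        allowed = 0
        for c in symbols:
            allowed |= masks.get(c, 0)
        state = ((state << 1) | 1) & allowed
        if state & accept:
            ends.append(j)
    return ends


def min_weight_match(pattern: str, text: List[Set[str]],
                     w_sub: int = 1, w_ins: int = 1, w_del: int = 1) -> int:
    """Minimum total weight of aligning the pattern to any text substring.

    Column-by-column dynamic program in the style of Sellers' algorithm.
    w_sub: replacing a pattern symbol, w_ins: an extra text symbol,
    w_del: a pattern symbol left unmatched.
    """
    m = len(pattern)
    col = [i * w_del for i in range(m + 1)]  # empty text prefix
    best = col[m]
    for symbols in text:
        new = [0] * (m + 1)  # a match may start at any text position
        for i in range(1, m + 1):
            mismatch = 0 if pattern[i - 1] in symbols else w_sub
            new[i] = min(col[i - 1] + mismatch,  # match or replacement
                         col[i] + w_ins,         # insertion
                         new[i - 1] + w_del)     # deletion
        col = new
        best = min(best, col[m])
    return best
```

For example, with the pattern `"ACG"` and the degenerate text `[{"A"}, {"C", "T"}, {"G"}]`, the second position can be read as either C or T, so the pattern matches exactly and both functions report a match ending at position 2 with weight 0.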
Similar resources
Pattern Matching on Weighted Sequences
Weighted sequences are used extensively as profiles for protein families, in the representation of binding sites and often for the representation of sequences produced by a shotgun sequencing strategy. We present various fundamental pattern matching problems on weighted sequences and their respective algorithms. In addition, we define two matching probabilistic measures and we give algorithms f...
The matching interdiction problem in dendrimers
The purpose of the matching interdiction problem in a weighted graph is to find two vertices such that the weight of the maximum matching in the graph without these vertices is minimized. An approximate solution for this problem has been presented. In this paper, we consider dendrimers as graphs such that the weights of edges are the bond lengths. We obtain the maximum matching in some types of...
Property Matching and Weighted Matching
In many pattern matching applications the text has some properties attached to various of its parts. Pattern Matching with Properties (Property Matching, for short), involves a string matching between the pattern and the text, and the requirement that the text part satisfies some property. Some immediate examples come from molecular biology where it has long been a practice to consider special ...
Approximate string matching as an algebraic computation
Approximate string matching has a long history and employs a wide variety of methods (see e.g. the survey [2]). We consider a variant of approximate matching that compares a fixed pattern string to every substring in the text string by a rational-weighted edit distance (e.g. the indel distance, defined as the number of character insertions and deletions, or the indelsub/Levenshtein distance, wh...
Parallel Algorithms for Degenerate and Weighted Sequences Derived from High Throughput Sequencing Technologies
Novel high throughput sequencing technologies have redefined the way genome sequencing is performed. They are able to produce millions of short sequences in a single experiment and with a much lower cost than previous methods. In this paper, we address the problem of efficiently mapping and classifying millions of degenerate and weighted sequences to a reference genome, based on whether they oc...
Publication date: 2007